Modeling Word Durations
Abstract
We describe a new method of modeling duration at the word level. These duration models are easily trained from the acoustic training data and can be used to rescore N-best lists of recognition hypotheses. The models capture some well-known durational effects, such as prepausal lengthening, and incorporate a simple back-off mechanism to handle unseen words during rescoring. Experiments on various large vocabulary conversational speech recognition (LVCSR) evaluation sets showed consistent improvements of 0.7 to 1.0% in word error rate (WER).
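As a rough illustration of the idea in the abstract, the sketch below trains per-word duration models from aligned (word, duration) pairs, backs off to a global model for unseen or rare words, and uses the resulting log-probabilities to rescore an N-best list. The abstract does not specify the density, the back-off scheme, or the score combination, so the log-normal density, the single global back-off, the `min_count` threshold, and the `weight` parameter are all assumptions made for this example. Prepausal lengthening is not modeled here.

```python
import math
from collections import defaultdict


class WordDurationModel:
    """Per-word log-normal duration model with a global back-off.

    Illustrative only: the density and back-off scheme are assumptions,
    not the method described in the abstract.
    """

    def __init__(self, min_count=2):
        self.min_count = min_count   # words seen fewer times use the back-off
        self.word_params = {}        # word -> (mean, variance) of log-duration
        self.backoff_params = None   # (mean, variance) over all words

    @staticmethod
    def _fit(log_durations):
        n = len(log_durations)
        mean = sum(log_durations) / n
        var = sum((x - mean) ** 2 for x in log_durations) / n
        return mean, max(var, 1e-4)  # floor the variance to avoid division by zero

    def train(self, aligned_words):
        """aligned_words: iterable of (word, duration_in_seconds) pairs."""
        per_word = defaultdict(list)
        all_logs = []
        for word, duration in aligned_words:
            log_d = math.log(duration)
            per_word[word].append(log_d)
            all_logs.append(log_d)
        self.backoff_params = self._fit(all_logs)
        self.word_params = {
            w: self._fit(v) for w, v in per_word.items() if len(v) >= self.min_count
        }

    def log_prob(self, word, duration):
        """Log-density of a duration; unseen words fall back to the global model."""
        mean, var = self.word_params.get(word, self.backoff_params)
        x = math.log(duration)
        return -0.5 * math.log(2 * math.pi * var) - x - (x - mean) ** 2 / (2 * var)


def rescore(nbest, model, weight=1.0):
    """Return the best hypothesis after adding weighted duration scores.

    nbest: list of (base_score, [(word, duration_in_seconds), ...]).
    """
    def total(hyp):
        base_score, words = hyp
        return base_score + weight * sum(model.log_prob(w, d) for w, d in words)

    return max(nbest, key=total)
```

In practice the weight on the duration score would be tuned on held-out data alongside the acoustic and language model scores; the value used here is a placeholder.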
Similar resources
Duration Modeling For Turkish Text-to-Speech Synthesis System
Naturalness of synthetic speech depends on appropriate modeling of prosodic aspects. Mostly, three prosody components are modeled: segmental duration, pitch contour and intensity. In this study, we present our work on modeling segmental duration in Turkish by using machine-learning algorithms. The models predict phone durations based on attributes such as phone identity, neighboring phone ident...
Segmental duration modeling in Turkish
Naturalness of synthetic speech depends heavily on appropriate modeling of prosodic aspects. Mostly, three prosody components are modeled: segmental duration, pitch contour and intensity. In this study, we present our work on modeling segmental duration in Turkish using machine-learning algorithms, especially Classification and Regression Trees (CART). The models predict phone durations based on ...
Word duration modeling for word graph rescoring in LVCSR
A well-known unfavorable property of HMMs in speech recognition is their inappropriate representation of phone and word durations. This paper describes an approach to resolve this limitation by integrating explicit word duration models into an HMM-based speech recognizer. Word durations are represented by log-normal densities using a back-off strategy that approximates durations of words that h...
Arabic Handwritten Word Recognition Using HMMs with Explicit State Duration
We describe an offline unconstrained Arabic handwritten word recognition system based on a segmentation-free approach and discrete hidden Markov models (HMMs) with explicit state duration. Character durations play a significant part in the recognition of cursive handwriting. The duration information is still mostly disregarded in HMM-based automatic cursive handwriting recognizers due to the fact...
Synchronizing timelines: Relations between fixation durations and N400 amplitudes during sentence reading
We examined relations between eye movements (single-fixation durations) and RSVP-based event-related potentials (ERPs; N400s) recorded during reading the same sentences in two independent experiments. Longer fixation durations correlated with larger N400 amplitudes. Word frequency and predictability of the fixated word as well as the predictability of the upcoming word accounted for this covari...